New York City is one of the most famous cities in the United States, as it is one of the financial, cultural, and entertainment centers of the world. All of these factors attract tourists from around the world. According to the Mastercard Global Destination Index report, New York City was one of the 10 most visited cities in 2019. 13.6 million foreign tourists traveled to New York last year. On average, they stayed approximately 8 nights and spent 152 dollars per day.
Airbnb is a way for international visitors to find accommodation when traveling abroad. In 2018, Airbnb had approximately 150 million guest users and 2.9 million hosts around the world, in over 191 countries. According to Wachsmuth et al. (2018), New York is the third-largest Airbnb market in the world, with more than 40,000 house and apartment rental listings across Manhattan, the Bronx, Queens, Brooklyn, and Staten Island. Furthermore, they found that New York’s Airbnb revenue increased by 14 percent, to $657 million, between 2016 and 2017, in line with an increase in the number of Airbnb guests and visitors to New York. Taking all of these factors into account, it is undeniable that Airbnb has become a way for many tourists around the world to find accommodation in New York City. Thus, Airbnb host data has become a popular research topic.
This report, which focuses on Airbnb hosts in New York City in 2019, contains 6 chapters: chapter 2 covers the source data and basic data visualization; …………………
In this study, the CSV data file comes from Inside Airbnb (http://insideairbnb.com/get-the-data.html). The dataset contains information on 48,895 Airbnb listings in New York City in 2019, with 16 variables: housing ID, accommodation name, host ID, host name, neighborhood group, sub-neighborhood, latitude, longitude, room type, price, minimum nights per stay, number of reviews, date of the last review, reviews per month, calculated host listings count, and the number of days available in one year. The str() function in R gives a quick snapshot of the data, as follows:
## 'data.frame': 48895 obs. of 16 variables:
## $ id : int 2539 2595 3647 3831 5022 5099 5121 5178 5203 5238 ...
## $ name : Factor w/ 47897 levels ""," 1 Bed Apt in Utopic Williamsburg ",..: 12564 38007 45009 15582 19210 24840 8248 24887 15477 17564 ...
## $ host_id : int 2787 2845 4632 4869 7192 7322 7356 8967 7490 7549 ...
## $ host_name : Factor w/ 11453 levels ""," Valéria",..: 4997 4791 2913 6210 5929 1938 3549 9649 6880 1235 ...
## $ neighbourhood_group : Factor w/ 5 levels "Bronx","Brooklyn",..: 2 3 3 2 3 3 2 3 3 3 ...
## $ neighbourhood : Factor w/ 221 levels "Allerton","Arden Heights",..: 109 128 95 42 62 138 14 96 203 36 ...
## $ latitude : num 40.6 40.8 40.8 40.7 40.8 ...
## $ longitude : num -74 -74 -73.9 -74 -73.9 ...
## $ room_type : Factor w/ 3 levels "Entire home/apt",..: 2 1 2 1 1 1 2 2 2 1 ...
## $ price : int 149 225 150 89 80 200 60 79 79 150 ...
## $ minimum_nights : int 1 1 3 1 10 3 45 2 2 1 ...
## $ number_of_reviews : int 9 45 0 270 9 74 49 430 118 160 ...
## $ last_review : Factor w/ 1765 levels "","1/1/13","1/1/15",..: 203 1059 1 1438 348 1234 277 1244 1383 1317 ...
## $ reviews_per_month : num 0.21 0.38 NA 4.64 0.1 0.59 0.4 3.47 0.99 1.33 ...
## $ calculated_host_listings_count: int 6 2 1 1 1 1 1 1 1 4 ...
## $ availability_365 : int 365 355 365 194 0 129 0 220 0 188 ...
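The report does not show the loading code or the file name, so the following is only a minimal sketch of how such a snapshot might be produced. The temporary file, the object name listings, and the three rows are stand-ins chosen for illustration.

```r
# Write a tiny stand-in CSV so the sketch runs without the real download.
# The path and the three rows are illustrative only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,neighbourhood_group,room_type,price,availability_365",
  "2539,Brooklyn,Private room,149,365",
  "2595,Manhattan,Entire home/apt,225,355",
  "3647,Manhattan,Private room,150,365"
), csv_path)

# stringsAsFactors = TRUE gives Factor columns like those in the str() output
# (it was the default before R 4.0.0).
listings <- read.csv(csv_path, stringsAsFactors = TRUE)

str(listings)      # dimensions, column types, first few values
names(listings)    # column names only
summary(listings)  # per-column summaries
```

With the real file, only the path passed to read.csv() would change.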
For basic data visualization, the latitude and longitude in the dataset are used to draw maps with the leaflet library in R.
For the first visualization, Airbnb listings are divided into four price groups and plotted on the map. On the map below, 33,957 listings are priced at up to $150 per night (yellow points), accounting for 69.4% of all Airbnb listings in New York City. In addition, 13,894 listings, or 28.4%, are in the $151-$500 per night group (blue points). Finally, 805 and 239 listings are in the $501-$1,000 (green points) and above $1,000 (red points) groups, respectively.
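The grouping described above can be sketched with cut(). The prices and coordinates here are made up, and the object names are hypothetical; only the break points and the colours follow the text. The leaflet part is guarded so the sketch still runs where the package is not installed.

```r
# Illustrative prices only, not the Inside Airbnb data.
price <- c(60, 149, 150, 151, 225, 500, 501, 1000, 1001, 10000)

# Intervals are closed on the right (the default), so 150 falls in the
# first group and 151 in the second.
price_group <- cut(
  price,
  breaks = c(-Inf, 150, 500, 1000, Inf),
  labels = c("<= $150", "$151-$500", "$501-$1,000", "> $1,000")
)

counts <- table(price_group)
shares <- round(100 * prop.table(counts), 1)
print(counts)
print(shares)

# Optional map, assuming the leaflet package is available.
if (requireNamespace("leaflet", quietly = TRUE)) {
  pts <- data.frame(
    longitude = seq(-74.00, -73.91, length.out = length(price)),  # made-up points
    latitude  = seq(40.60, 40.87, length.out = length(price)),
    group     = price_group
  )
  pal <- leaflet::colorFactor(c("yellow", "blue", "green", "red"),
                              domain = levels(price_group))
  m <- leaflet::leaflet(pts)
  m <- leaflet::addTiles(m)
  m <- leaflet::addCircleMarkers(m, lng = ~longitude, lat = ~latitude,
                                 color = ~pal(group), radius = 3)
}
```

The same pattern would apply to the room-type map, with room_type used directly as the grouping factor.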
Airbnb hosts can list entire homes/apartments, private rooms, or shared rooms. According to the map below, there are 25,409 entire homes/apartments (blue points), accounting for approximately 52.0% of all Airbnb listings in New York City, along with 22,326 private rooms (yellow points) and only 1,160 shared rooms (red points).
## [1] "id" "name"
## [3] "host_id" "host_name"
## [5] "neighbourhood_group" "neighbourhood"
## [7] "latitude" "longitude"
## [9] "room_type" "price"
## [11] "minimum_nights" "number_of_reviews"
## [13] "last_review" "reviews_per_month"
## [15] "calculated_host_listings_count" "availability_365"
## id name
## Min. : 2539 Hillside Hotel : 18
## 1st Qu.: 9471945 Home away from home : 17
## Median :19677284 : 16
## Mean :19017143 New york Multi-unit building : 16
## 3rd Qu.:29152178 Brooklyn Apartment : 12
## Max. :36487245 Loft Suite @ The Box House Hotel: 11
## (Other) :48805
## host_id host_name neighbourhood_group
## Min. : 2438 Michael : 417 Bronx : 1091
## 1st Qu.: 7822033 David : 403 Brooklyn :20104
## Median : 30793816 Sonder (NYC): 327 Manhattan :21661
## Mean : 67620011 John : 294 Queens : 5666
## 3rd Qu.:107434423 Alex : 279 Staten Island: 373
## Max. :274321313 Blueground : 232
## (Other) :46943
## neighbourhood latitude longitude
## Williamsburg : 3920 Min. :40.50 Min. :-74.24
## Bedford-Stuyvesant: 3714 1st Qu.:40.69 1st Qu.:-73.98
## Harlem : 2658 Median :40.72 Median :-73.96
## Bushwick : 2465 Mean :40.73 Mean :-73.95
## Upper West Side : 1971 3rd Qu.:40.76 3rd Qu.:-73.94
## Hell's Kitchen : 1958 Max. :40.91 Max. :-73.71
## (Other) :32209
## room_type price minimum_nights
## Entire home/apt:25409 Min. : 0.0 Min. : 1.00
## Private room :22326 1st Qu.: 69.0 1st Qu.: 1.00
## Shared room : 1160 Median : 106.0 Median : 3.00
## Mean : 152.7 Mean : 7.03
## 3rd Qu.: 175.0 3rd Qu.: 5.00
## Max. :10000.0 Max. :1250.00
##
## number_of_reviews last_review reviews_per_month
## Min. : 0.00 :10052 Min. : 0.010
## 1st Qu.: 1.00 6/23/19: 1413 1st Qu.: 0.190
## Median : 5.00 7/1/19 : 1359 Median : 0.720
## Mean : 23.27 6/30/19: 1341 Mean : 1.373
## 3rd Qu.: 24.00 6/24/19: 875 3rd Qu.: 2.020
## Max. :629.00 7/7/19 : 718 Max. :58.500
## (Other):33137 NA's :10052
## calculated_host_listings_count availability_365
## Min. : 1.000 Min. : 0.0
## 1st Qu.: 1.000 1st Qu.: 0.0
## Median : 1.000 Median : 45.0
## Mean : 7.144 Mean :112.8
## 3rd Qu.: 2.000 3rd Qu.:227.0
## Max. :327.000 Max. :365.0
##
## id name
## Min. : 2539 Home away from home : 12
## 1st Qu.: 9045427 Loft Suite @ The Box House Hotel: 11
## Median :19175650 #NAME? : 10
## Mean :18313347 Brooklyn Apartment : 9
## 3rd Qu.:27703016 Private Room : 9
## Max. :36455809 Cozy Brooklyn Apartment : 8
## (Other) :32978
## host_id host_name neighbourhood_group
## Min. : 2571 Michael: 267 Bronx : 826
## 1st Qu.: 7248357 David : 256 Brooklyn :14526
## Median : 29464755 John : 209 Manhattan :13194
## Mean : 64562362 Alex : 184 Queens : 4192
## 3rd Qu.:101978485 Sarah : 170 Staten Island: 299
## Max. :273841667 Maria : 155
## (Other):31796
## neighbourhood latitude longitude
## Bedford-Stuyvesant: 2806 Min. :40.51 Min. :-74.24
## Williamsburg : 2776 1st Qu.:40.69 1st Qu.:-73.98
## Harlem : 1950 Median :40.72 Median :-73.95
## Bushwick : 1767 Mean :40.73 Mean :-73.95
## East Village : 1287 3rd Qu.:40.76 3rd Qu.:-73.93
## Hell's Kitchen : 1191 Max. :40.91 Max. :-73.71
## (Other) :21260
## room_type price minimum_nights number_of_reviews
## Entire home/apt:16086 Min. : 0 Min. : 1.000 Min. : 1.00
## Private room :16191 1st Qu.: 65 1st Qu.: 1.000 1st Qu.: 3.00
## Shared room : 760 Median :100 Median : 2.000 Median : 11.00
## Mean :118 Mean : 2.661 Mean : 31.73
## 3rd Qu.:150 3rd Qu.: 3.000 3rd Qu.: 37.00
## Max. :334 Max. :11.000 Max. :629.00
##
## last_review reviews_per_month calculated_host_listings_count
## 6/23/19: 1311 Min. : 0.010 Min. : 1.000
## 7/1/19 : 1242 1st Qu.: 0.210 1st Qu.: 1.000
## 6/30/19: 1222 Median : 0.850 Median : 1.000
## 6/24/19: 801 Mean : 1.482 Mean : 3.343
## 7/7/19 : 674 3rd Qu.: 2.230 3rd Qu.: 2.000
## 7/2/19 : 614 Max. :58.500 Max. :327.000
## (Other):27173
## availability_365
## Min. : 0.0
## 1st Qu.: 0.0
## Median : 38.0
## Mean :103.7
## 3rd Qu.:190.0
## Max. :365.0
##
##
## Call:
## lm(formula = price ~ minimum_nights + number_of_reviews + reviews_per_month +
## calculated_host_listings_count + availability_365, data = nyc4)
##
## Residuals:
## Min 1Q Median 3Q Max
## -124.86 -50.88 -16.84 35.82 227.07
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 109.646231 0.839378 130.628 < 2e-16 ***
## minimum_nights 2.810407 0.210163 13.373 < 2e-16 ***
## number_of_reviews 0.005585 0.008675 0.644 0.52
## reviews_per_month -1.155127 0.256644 -4.501 6.79e-06 ***
## calculated_host_listings_count 0.304686 0.016975 17.949 < 2e-16 ***
## availability_365 0.013871 0.003042 4.560 5.14e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 65.71 on 33031 degrees of freedom
## Multiple R-squared: 0.01742, Adjusted R-squared: 0.01727
## F-statistic: 117.1 on 5 and 33031 DF, p-value: < 2.2e-16
## minimum_nights number_of_reviews
## 1.076810 1.463123
## reviews_per_month calculated_host_listings_count
## 1.531107 1.022583
## availability_365
## 1.108524
All variables except number_of_reviews are statistically significant at the one percent level. The intercept suggests that the expected price of an Airbnb listing is about 109.65 dollars when all predictors are zero. Holding the other predictors constant, the coefficient on minimum_nights suggests that a one-unit increase in the minimum number of nights increases the price by 2.81 dollars. The coefficient on reviews_per_month suggests that a one-unit increase in reviews per month decreases the price by 1.16 dollars. The coefficient on calculated_host_listings_count suggests that a one-unit increase in the host's listing count increases the price by 0.30 dollars (about 30 cents). The coefficient on availability_365 suggests that each additional available day per year increases the price by 0.014 dollars (about 1.4 cents). The adjusted R-squared of 0.017 shows that these predictors together explain less than 2 percent of the variation in price. All VIF values are below 1.6, which suggests that multicollinearity is not a concern in this linear model.
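The report does not show which function produced the VIF values. As a rough illustration of what the statistic measures, the sketch below fits a linear model on simulated data and computes each VIF by hand as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. All variable names and data are hypothetical.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)   # mildly correlated with x1 on purpose
x3 <- rnorm(n)
y  <- 100 + 3 * x1 - 1 * x2 + 0.3 * x3 + rnorm(n, sd = 5)
dat <- data.frame(y, x1, x2, x3)

# Ordinary least squares fit; coefficients are read as the change in y
# for a one-unit change in a predictor, holding the others fixed.
fit <- lm(y ~ x1 + x2 + x3, data = dat)
print(coef(summary(fit)))

# Manual VIF: regress each predictor on the remaining ones.
predictors <- c("x1", "x2", "x3")
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = dat))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 3))
```

A VIF close to 1 means a predictor is nearly uncorrelated with the others; commonly quoted warning thresholds are 5 or 10.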
Are Airbnb prices the same across New York City neighbourhood groups and among different room types?
H0: The prices are the same across NYC neighbourhood groups. H1: There are significant differences in prices across NYC neighbourhood groups. We use ANOVA to test this hypothesis. Before running the test, price outliers are replaced with NA using the outlierKD function. The results are summarized below.
# Detect outliers with the boxplot rule, plot the variable before and after,
# and optionally replace the outliers with NA in the original data frame.
outlierKD <- function(dt, var, rmv = NULL) {
  # Evaluate the unquoted column name inside the data frame
  var_name <- eval(substitute(var), eval(dt))
  na1 <- sum(is.na(var_name))
  m1 <- mean(var_name, na.rm = TRUE)
  sd1 <- sd(var_name, na.rm = TRUE)
  par(mfrow = c(2, 2), oma = c(0, 0, 3, 0))
  boxplot(var_name, main = "With outliers")
  hist(var_name, main = "With outliers", xlab = NA, ylab = NA)
  # Points beyond 1.5 times the hinge spread from the hinges
  outlier <- boxplot.stats(var_name)$out
  mo <- mean(outlier)
  var_name <- ifelse(var_name %in% outlier, NA, var_name)
  boxplot(var_name, main = "Without outliers")
  hist(var_name, main = "Without outliers", xlab = NA, ylab = NA)
  title("Outlier Check", outer = TRUE)
  na2 <- sum(is.na(var_name))
  cat("Outliers identified:", na2 - na1, "\n")
  cat("Proportion (%) of outliers:", round((na2 - na1) / sum(!is.na(var_name)) * 100, 1), "\n")
  cat("Mean of the outliers:", round(mo, 2), "\n")
  m2 <- mean(var_name, na.rm = TRUE)
  cat("Mean without removing outliers:", round(m1, 2), "\n")
  cat("Mean if we remove outliers:", round(m2, 2), "\n")
  # Ask interactively unless the caller already answered through rmv
  if (is.null(rmv)) {
    response <- readline(prompt = "Do you want to remove outliers and to replace with NA? [yes/no]: ")
  } else {
    if (rmv == 'y' | rmv == 'yes' | rmv == 'Y' | rmv == 'Yes' | rmv == 'YES' | rmv == TRUE) { response <- 'y' } else { response <- 'n' }
  }
  if (response == "y" | response == "yes") {
    dt[as.character(substitute(var))] <- invisible(var_name)
    # Overwrite the caller's data frame in the global environment
    assign(as.character(as.list(match.call())$dt), dt, envir = .GlobalEnv)
    cat("Outliers successfully removed", "\n")
    return(invisible(dt))
  } else {
    cat("Nothing changed", "\n")
    return(invisible(var_name))
  }
}
outlierKD(d, price, 'y')
## Outliers identified: 2972
## Proportion (%) of outliers: 6.5
## Mean of the outliers: 658.78
## Mean without removing outliers: 152.72
## Mean if we remove outliers: 119.97
## Outliers successfully removed
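The outlier rule used inside outlierKD comes from boxplot.stats(), which flags values lying more than 1.5 times the hinge spread beyond the hinges. The short sketch below applies the same rule to a made-up price vector so the fences can be checked by hand; the vector and the object names are illustrative only.

```r
# Illustrative prices only.
x <- c(60, 79, 80, 89, 149, 150, 150, 200, 225, 10000)

bs <- boxplot.stats(x)
out <- bs$out                    # values flagged as outliers

# bs$stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker
hinge_spread <- bs$stats[4] - bs$stats[2]
lower_fence  <- bs$stats[2] - 1.5 * hinge_spread
upper_fence  <- bs$stats[4] + 1.5 * hinge_spread

# Replace flagged values with NA, as outlierKD does
x_clean <- ifelse(x %in% out, NA, x)

cat("Fences:", lower_fence, upper_fence, "\n")
cat("Mean with outliers:", round(mean(x), 2), "\n")
cat("Mean without outliers:", round(mean(x_clean, na.rm = TRUE), 2), "\n")
```

Because the fences depend only on the hinges, a single extreme price shifts the mean a great deal while leaving the cut-off almost unchanged.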
summary(d)
## neighbourhood_group room_type price
## Bronx : 1091 Entire home/apt:25409 Min. : 0
## Brooklyn :20104 Private room :22326 1st Qu.: 65
## Manhattan :21661 Shared room : 1160 Median :100
## Queens : 5666 Mean :120
## Staten Island: 373 3rd Qu.:159
## Max. :334
## NA's :2972
d1 <- na.omit(d)
summary(d1)
## neighbourhood_group room_type price
## Bronx : 1070 Entire home/apt:22789 Min. : 0
## Brooklyn :19415 Private room :21996 1st Qu.: 65
## Manhattan :19506 Shared room : 1138 Median :100
## Queens : 5567 Mean :120
## Staten Island: 365 3rd Qu.:159
## Max. :334
## Df Sum Sq Mean Sq F value Pr(>F)
## neighbourhood_group 4 24781929 6195482 1509 <2e-16 ***
## Residuals 45918 188500166 4105
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = price ~ neighbourhood_group, data = d1)
##
## $neighbourhood_group
## diff lwr upr p adj
## Brooklyn-Bronx 28.3341931 22.845989 33.822397 0.0000000
## Manhattan-Bronx 68.5874145 63.099879 74.074950 0.0000000
## Queens-Bronx 11.5390163 5.705153 17.372880 0.0000007
## Staten Island-Bronx 11.8701959 1.276184 22.464208 0.0190278
## Manhattan-Brooklyn 40.2532213 38.481432 42.025010 0.0000000
## Queens-Brooklyn -16.7951768 -19.452272 -14.138082 0.0000000
## Staten Island-Brooklyn -16.4639973 -25.697593 -7.230402 0.0000114
## Queens-Manhattan -57.0483982 -59.704112 -54.392684 0.0000000
## Staten Island-Manhattan -56.7172186 -65.950417 -47.484021 0.0000000
## Staten Island-Queens 0.3311796 -9.111959 9.774318 0.9999811
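The calls behind output of this kind are not shown in the report. A minimal sketch of a one-way ANOVA followed by Tukey's HSD test, on simulated data with hypothetical group labels, might look like this:

```r
set.seed(42)
# Three made-up groups; A is cheaper, B and C are nearly identical.
group <- factor(rep(c("A", "B", "C"), each = 50))
price <- c(rnorm(50, mean = 80,  sd = 20),
           rnorm(50, mean = 120, sd = 20),
           rnorm(50, mean = 122, sd = 20))
toy <- data.frame(group, price)

# One-way ANOVA: H0 says all group means are equal.
fit <- aov(price ~ group, data = toy)
print(summary(fit))
p_overall <- summary(fit)[[1]][["Pr(>F)"]][1]

# Tukey's HSD compares every pair of groups while controlling the
# family-wise error rate at 5%.
tukey <- TukeyHSD(fit)
print(tukey)
```

A small overall p-value only says that at least one group differs; the Tukey table shows which pairs differ, through the adjusted p-value and whether each interval between lwr and upr excludes zero.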
5.3 Are prices the same across Airbnb room types?
H0: The prices are the same across Airbnb room types. H1: There are significant differences in prices across Airbnb room types. We use ANOVA to test this hypothesis. The results are summarized below.
summary(d$room_type)
## Entire home/apt Private room Shared room
## 25409 22326 1160
## Df Sum Sq Mean Sq F value Pr(>F)
## d$room_type 2 82350853 41175426 14441 <2e-16 ***
## Residuals 45920 130931242 2851
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 2972 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = d$price ~ d$room_type, data = d1)
##
## $`d$room_type`
## diff lwr upr p adj
## Private room-Entire home/apt -83.50859 -84.69151 -82.32568 0
## Shared room-Entire home/apt -103.23360 -107.03491 -99.43229 0
## Shared room-Private room -19.72501 -23.52957 -15.92044 0
Here we try to answer the question: “Which neighborhood group is the least available, and how do the groups rank by availability?” If low availability is read as a sign of popularity, the most popular neighborhood group is Brooklyn, followed by Manhattan, Queens, the Bronx, and Staten Island.
## Bronx Brooklyn Manhattan Queens Staten Island
## 1091 20104 21661 5666 373
If we rank the neighborhood groups by median availability, the least available group is Brooklyn, followed by Manhattan, Queens, the Bronx, and Staten Island.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 0.0 28.0 100.2 188.0 365.0
Let's look at the least available neighborhood group, Brooklyn. The median for Brooklyn is 28 days, meaning that half of the listings in Brooklyn are available for 28 days or fewer out of 365. In short, the median listing is available about 7.7% of the time.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 36 112 230 365
Let's look at the second least available neighborhood group, Manhattan. The median for Manhattan is 36 days, meaning that half of the listings in Manhattan are available for 36 days or fewer out of 365. In short, the median listing is available about 9.9% of the time.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 2.0 98.0 144.5 286.0 365.0
Let's look at the third least available neighborhood group, Queens. The median for Queens is 98 days, meaning that half of the listings in Queens are available for 98 days or fewer out of 365. In short, the median listing is available about 26.8% of the time.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 37.0 148.0 165.8 313.5 365.0
Let's look at the fourth least available neighborhood group, the Bronx. The median for the Bronx is 148 days, meaning that half of the listings in the Bronx are available for 148 days or fewer out of 365. In short, the median listing is available about 40.5% of the time.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 78.0 219.0 199.7 333.0 365.0
Let's look at the most available neighborhood group, Staten Island. The median for Staten Island is 219 days, meaning that half of the listings in Staten Island are available for 219 days or fewer out of 365. In short, the median listing is available 60% of the time.
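The per-group medians and the share-of-year figures quoted above follow from simple arithmetic: the median divided by 365, times 100. A minimal sketch on a made-up data frame (the rows and the name toy are illustrative; the column names follow the dataset) is:

```r
# Illustrative rows only; column names follow the dataset.
toy <- data.frame(
  neighbourhood_group = c("Brooklyn", "Brooklyn", "Brooklyn",
                          "Queens", "Queens", "Queens"),
  availability_365    = c(0, 28, 188, 2, 98, 286)
)

# Median availability per neighbourhood group
med <- tapply(toy$availability_365, toy$neighbourhood_group, median)

# Express each median as a percentage of the year
pct_of_year <- round(100 * med / 365, 1)

# Rank groups from least to most available
ranked <- sort(med)

print(med)
print(pct_of_year)
print(names(ranked))
```

Rounding to one decimal place keeps the percentages readable without losing anything that matters for the ranking.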
Given these availability figures, are the differences across the neighborhood groups statistically significant?
## Df Sum Sq Mean Sq F value Pr(>F)
## neighbourhood_group 4 14741568 3685392 216.5 <2e-16 ***
## Residuals 48890 832318962 17024
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = availability_365 ~ neighbourhood_group, data = Airbnbdata)
##
## $neighbourhood_group
## diff lwr upr p adj
## Brooklyn-Bronx -65.52664 -76.590498 -54.462791 0.0000000
## Manhattan-Bronx -53.77953 -64.822893 -42.736160 0.0000000
## Queens-Bronx -21.30712 -33.074224 -9.540014 0.0000078
## Staten Island-Bronx 33.91935 12.571845 55.266850 0.0001425
## Manhattan-Brooklyn 11.74712 8.261586 15.232650 0.0000000
## Queens-Brooklyn 44.21953 38.866233 49.572819 0.0000000
## Staten Island-Brooklyn 99.44599 80.847367 118.044617 0.0000000
## Queens-Manhattan 32.47241 27.161586 37.783230 0.0000000
## Staten Island-Manhattan 87.69887 69.112429 106.285319 0.0000000
## Staten Island-Queens 55.22647 36.201095 74.251837 0.0000000
The means and medians of availability vary across the neighborhood groups, so we performed an ANOVA test to see whether the differences are real. The p-value is less than 0.05, and therefore we reject the null hypothesis. This means there is a significant difference between the neighborhood groups in terms of their availability.
We also performed a Tukey test. All of the adjusted p-values are less than 0.05, which means that every pairwise difference between neighborhood groups is significant.
## Entire home/apt Private room Shared room
## 25409 22326 1160
The output above shows the number of listings for each room type. By room type, shared rooms are the most available, while private rooms and entire homes/apartments are almost equally available. We also test whether the differences in means are significant.
## availability_365
## Min. : 0.0
## 1st Qu.: 0.0
## Median : 42.0
## Mean :111.9
## 3rd Qu.:229.0
## Max. :365.0
## availability_365
## Min. : 0.0
## 1st Qu.: 0.0
## Median : 45.0
## Mean :111.2
## 3rd Qu.:214.0
## Max. :365.0
## Df Sum Sq Mean Sq F value Pr(>F)
## room_type 2 2884561 1442280 83.53 <2e-16 ***
## Residuals 48892 844175969 17266
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = availability_365 ~ room_type, data = Airbnbdata)
##
## $room_type
## diff lwr upr p adj
## Private room-Entire home/apt -0.7163712 -3.541374 2.108632 0.8231684
## Shared room-Entire home/apt 50.0805582 40.834331 59.326785 0.0000000
## Shared room-Private room 50.7969294 41.522872 60.070987 0.0000000
The means and medians of availability vary across room types, so we performed an ANOVA test to see whether the differences are real. The p-value is less than 0.05, and therefore we reject the null hypothesis. This means there is a significant difference between the room types in terms of their availability.
We also performed a Tukey test. The adjusted p-value for the comparison between private rooms and entire homes/apartments is 0.82, well above 0.05, so the difference between their means is not significant. Both comparisons involving shared rooms are significant.